Selection criteria for hypothesis driven lexical adaptation

Authors

  • Petra Geutner
  • Michael Finke
  • Alexander H. Waibel
Abstract

Adapting the vocabulary of a speech recognizer to the utterance to be recognized has proven successful in reducing both high out-of-vocabulary (OOV) rates and word error rates. This applies especially to languages whose vocabulary grows rapidly because of a large number of inflections and compounds. This paper presents various adaptation methods within the Hypothesis Driven Lexical Adaptation (HDLA) framework, which allow speech recognition on a virtually unlimited vocabulary. Selection criteria for the adaptation process are based either on morphological knowledge or on distance measures at the phoneme or grapheme level. Different methods are introduced for determining distances between phoneme pairs and for creating the large fallback lexicon from which the adapted vocabulary is chosen. HDLA reduces the OOV rate by 55% for Serbo-Croatian, 35% for German and 27% for Turkish. The reduced OOV rate also decreases the word error rate by 4.1% absolute, to 25.4%, on Serbo-Croatian broadcast news data.
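The grapheme-level selection criterion mentioned in the abstract can be illustrated with a small sketch. It assumes that a first recognition pass has produced hypothesis words and that plain Levenshtein distance serves as the grapheme distance; the abstract does not specify the exact measure, so this is an illustration and not the authors' method. The names edit_distance, adapt_vocabulary, oov_rate and the parameter per_word are hypothetical.

```python
def edit_distance(a: str, b: str) -> int:
    """Levenshtein distance between two spellings (grapheme level)."""
    prev = list(range(len(b) + 1))
    for i, ca in enumerate(a, start=1):
        curr = [i]
        for j, cb in enumerate(b, start=1):
            cost = 0 if ca == cb else 1
            curr.append(min(prev[j] + 1,          # deletion
                            curr[j - 1] + 1,      # insertion
                            prev[j - 1] + cost))  # substitution or match
        prev = curr
    return prev[-1]


def adapt_vocabulary(hypothesis_words, fallback_lexicon, per_word=2):
    """Assumed selection rule: for each hypothesis word, add the
    per_word closest entries of the fallback lexicon."""
    adapted = set(hypothesis_words)
    for word in hypothesis_words:
        # Ties are broken alphabetically so the result is deterministic.
        ranked = sorted(fallback_lexicon,
                        key=lambda cand: (edit_distance(word, cand), cand))
        adapted.update(ranked[:per_word])
    return adapted


def oov_rate(reference_words, vocabulary):
    """Fraction of reference tokens that are missing from the vocabulary."""
    missing = sum(1 for w in reference_words if w not in vocabulary)
    return missing / len(reference_words)


if __name__ == "__main__":
    fallback = ["kinder", "kindes", "baum"]  # toy fallback lexicon
    adapted = adapt_vocabulary(["kind"], fallback, per_word=2)
    print(sorted(adapted))
    print(oov_rate(["kinder", "baum"], adapted))
```

In this toy setting the inflected forms closest to the hypothesis word enter the adapted vocabulary, which lowers the OOV rate on the reference words. A phoneme-level variant would apply the same procedure to pronunciations with weighted substitution costs.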

Similar resources

Transcribing Multilingual Broadcast News Using Hypothesis Driven Lexical Adaptation

This paper describes first results of our DARPA-sponsored efforts toward recognizing and browsing foreign-language, more specifically Serbo-Croatian, broadcast news. For Serbo-Croatian, as for many languages other than the most common, well-studied ones, the problems of broadcast-quality recognition are complicated by (1) the lack of available acoustic and language data, and (2) the excessive voca...

Phonetic-distance-based hypothesis driven lexical adaptation for transcribing multilingual broadcast news

High out-of-vocabulary (OOV) rates are one of the most prevalent problems for languages with rapid vocabulary growth due to a large number of inflections. Especially when transcribing Serbo-Croatian and German broadcast news, the OOV-rate is between 8.7% and 4.5%. Hypothesis Driven Lexical Adaptation (HDLA) has already been shown to decrease high OOV-rates significantly by using morphology-ba...

The production of lexical categories (VP) and functional categories (copula) at the initial stage of child L2 acquisition

This is a longitudinal case study of two Farsi-speaking children learning English: ‘Bernard’ and ‘Melissa’, who were 7;4 and 8;4 at the start of data collection. The research deals with the initial state and further development in the child second language (L2) acquisition of syntax regarding the presence or absence of copula as a functional category, as well as the role and degree of L1 influe...

A Model for Standardization/Adaptation Strategy Selection in Iran's Multinational Companies (MNCs)

Purpose: The research aims at evaluating the standardization/adaptation of international marketing strategy in Iranian multinational companies (MNCs), based on a model that considers the impact of external environmental variables on the internal marketing-mix variables (i.e. Product, Promotion, Price and Place), whereas previous research made no attempt to examine the interdepende...

DeepPurple: Lexical, String and Affective Feature Fusion for Sentence-Level Semantic Similarity Estimation

This paper describes our submission for the *SEM shared task of Semantic Textual Similarity. We estimate the semantic similarity between two sentences using regression models with features: 1) n-gram hit rates (lexical matches) between sentences, 2) lexical semantic similarity between non-matching words, 3) string similarity metrics, 4) affective content similarity and 5) sentence length. Domai...

Publication date: 1999